Smart Paper Notes

Graph-Coupled Oscillator Networks

T. Konstantin Rusch , Benjamin P. Chamberlain , James Rowbottom , Siddhartha Mishra , Michael M. Bronstein

Categories: Machine Learning | Machine Learning (Statistics)

2022-02-04

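We propose Graph-Coupled Oscillator Networks (GraphCON), a novel framework for deep learning on graphs. It is based on discretizations of a second-order system of ordinary differential equations (ODEs) that models a network of nonlinear controlled and damped oscillators, coupled via the adjacency structure of the underlying graph. The flexibility of our framework permits any basic GNN layer (e.g., convolutional or attentional) as the coupling function, from which a multi-layer deep neural network is built up via the dynamics of the proposed ODEs. We relate the oversmoothing problem commonly encountered in GNNs to the stability of steady states of the underlying ODE, and show that zero-Dirichlet-energy steady states are not stable for our proposed ODEs. This demonstrates that the proposed framework mitigates the oversmoothing problem. Moreover, we prove that GraphCON mitigates the exploding and vanishing gradients problem, which facilitates the training of deep multi-layer GNNs. Finally, we show that our approach offers performance competitive with the state of the art on a variety of graph-based learning tasks.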
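
As a rough illustration of how a damped, controlled oscillator ODE can be discretized into stacked layers, the sketch below advances node features `X` and their velocities `Y` by one time step. The abstract does not state the equations, so the update rule, the parameters `dt`, `alpha`, and `gamma`, the mean-aggregation coupling, and the function names are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def mean_coupling(X, A, W):
    """Hypothetical coupling function: average neighbour features, then a linear map."""
    deg = A.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0  # avoid division by zero for isolated nodes
    return (A @ X / deg) @ W

def oscillator_step(X, Y, A, W, dt=1.0, alpha=1.0, gamma=1.0):
    """One time step of the assumed ODE X'' = relu(F(X)) - gamma * X - alpha * X'."""
    force = np.maximum(mean_coupling(X, A, W), 0.0)   # nonlinear coupling term
    Y_new = Y + dt * (force - gamma * X - alpha * Y)  # velocity update (damping via alpha)
    X_new = X + dt * Y_new                            # position update uses the new velocity
    return X_new, Y_new

# Toy path graph with 3 nodes and 4 features per node.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = rng.normal(size=(3, 4))
Y = np.zeros_like(X)
W = rng.normal(size=(4, 4))

# Each step plays the role of one layer; here the weights are shared across steps.
for _ in range(5):
    X, Y = oscillator_step(X, Y, A, W, dt=0.5)
```

Swapping `mean_coupling` for an attention-style aggregation would change only the coupling term, which is the kind of flexibility the abstract describes.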

Translated by Google Translate

Unsupervised Machine Learning for Exploratory Data Analysis of Exoplanet Transmission Spectra

Konstantin T. Matchev , Katia Matcheva , Alexander Roman

Category: Machine Learning

2022-01-07

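Transit spectroscopy is a powerful tool for decoding the chemical composition of the atmospheres of extrasolar planets. In this paper, we focus on unsupervised techniques for analyzing spectral data from transiting exoplanets. We demonstrate methods for (i) cleaning and validating the data, (ii) initial exploratory data analysis based on summary statistics (estimates of location and variability), (iii) exploring and quantifying the existing correlations in the data, (iv) pre-processing and linearly transforming the data to its principal components, (v) dimensionality reduction and manifold learning, (vi) clustering and anomaly detection, and (vii) visualization and interpretation of the data. To illustrate the proposed unsupervised methodology, we use a well-known public benchmark data set of synthetic transit spectra. We show that there is a high degree of correlation in the spectral data, which calls for appropriate low-dimensional representations. We explore a number of different techniques for such dimensionality reduction and identify several suitable options in terms of summary statistics, principal components, and so on. We uncover interesting structures in the principal component basis, namely well-defined branches corresponding to different chemical regimes of the underlying atmospheres. We demonstrate that these branches can be successfully recovered with a K-means clustering algorithm in a fully unsupervised fashion. We advocate a three-dimensional representation of the spectroscopic data in terms of the first three principal components, in order to reveal the existing structure in the data and to quickly characterize the chemical class of a planet.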
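
The PCA-then-K-means part of the workflow described above can be sketched as follows. This is a minimal NumPy illustration on toy data: the two synthetic groups of "spectra", the helper names, and the farthest-point initialization are assumptions for the example and stand in for neither the benchmark data set nor the authors' code.

```python
import numpy as np

def pca_project(X, n_components=3):
    """Project mean-centred rows of X onto the leading principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(Z, k, n_iter=50):
    """Plain Lloyd iterations with a deterministic farthest-point initialisation."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(Z - c, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy stand-in for spectra: 100 objects, 20 wavelength bins, two well-separated groups.
rng = np.random.default_rng(0)
spectra = np.vstack([
    rng.normal(0.0, 0.1, size=(50, 20)),
    rng.normal(3.0, 0.1, size=(50, 20)),
])

Z = pca_project(spectra, n_components=3)  # three-dimensional representation
labels, centers = kmeans(Z, k=2)          # unsupervised grouping in PC space
```

On real spectra the number of clusters and the number of retained components would have to be chosen from the data rather than fixed in advance as they are here.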

Analytical Modelling of Exoplanet Transit Spectroscopy with Dimensional Analysis and Symbolic Regression

Konstantin T. Matchev , Katia Matcheva , Alexander Roman

Category: Machine Learning

2021-12-22

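The physical characteristics and atmospheric chemical composition of newly discovered exoplanets are often inferred from their transit spectra, which are obtained from complex numerical models of radiative transfer. Alternatively, simple analytical expressions provide insightful physical intuition into the relevant atmospheric processes. The deep learning revolution has opened the door to deriving such analytical results directly with a computer algorithm fitted to the data. As a proof of concept, we successfully demonstrate the use of symbolic regression on synthetic data for the transit radii of generic hot-Jupiter exoplanets to derive a corresponding analytical formula. As a pre-processing step, we use dimensional analysis to identify the relevant dimensionless combinations of variables and reduce the number of independent inputs, which improves the performance of the symbolic regression. The dimensional analysis also allows us to mathematically derive and properly parametrize the most general family of degeneracies among the input atmospheric parameters that affect the characterization of an exoplanet atmosphere through transit spectroscopy.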
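
The dimensional-analysis step mentioned above amounts to finding exponent vectors that make a product of variables dimensionless, i.e. the null space of a dimension matrix. The sketch below shows this on a textbook pendulum example rather than on atmospheric parameters; the variable set, the matrix `D`, and the function name are assumptions chosen for illustration.

```python
import numpy as np

def dimensionless_groups(D, tol=1e-10):
    """Return a basis of exponent vectors e with D @ e = 0 (null space via SVD)."""
    _, s, Vt = np.linalg.svd(D)
    rank = int(np.sum(s > tol))
    return Vt[rank:]

# Columns: period T, length L, gravitational acceleration g, mass m.
# Rows: exponents of the base dimensions mass, length, time.
D = np.array([
    [0, 0,  0, 1],   # mass
    [0, 1,  1, 0],   # length
    [1, 0, -2, 0],   # time
], dtype=float)

groups = dimensionless_groups(D)

# Rescale so the exponent of g equals 1; the result is proportional to
# [2, -1, 1, 0], i.e. the single dimensionless group T^2 * g / L.
pi_group = groups[0] / groups[0][2]
print(np.round(pi_group, 6))
```

A regression would then be run on such groups instead of on the raw variables, which reduces the number of independent inputs.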

Flexible Supervised Autonomy for Exploration in Subterranean Environments

Harel Biggie , Eugene R. Rush , Danny G. Riley , Shakeeb Ahmad , Michael T. Ohradzansky , Kyle Harlow , Michael J. Miles , Daniel Torres , Steve McGuire , Eric W. Frew

Category: Robotics

2023-01-02

While the capabilities of autonomous systems have been steadily improving in recent years, these systems still struggle to rapidly explore previously unknown environments without the aid of GPS-assisted navigation. The DARPA Subterranean (SubT) Challenge aimed to fast-track the development of autonomous exploration systems by evaluating their performance in real-world underground search-and-rescue scenarios. Subterranean environments present a plethora of challenges for robotic systems, such as limited communications, complex topology, visually degraded sensing, and harsh terrain. The presented solution enables long-term autonomy with minimal human supervision by combining a powerful and independent single-agent autonomy stack with higher-level mission management operating over a flexible mesh network. The autonomy suite deployed on quadruped and wheeled robots was fully independent, freeing the human supervisor to loosely supervise the mission and make high-impact strategic decisions. We also discuss lessons learned from fielding our system at the SubT Final Event, relating to vehicle versatility, system adaptability, and re-configurable communications.

Muse: Text-To-Image Generation via Masked Generative Transformers

Huiwen Chang , Han Zhang , Jarred Barber , AJ Maschinot , Jose Lezama , Lu Jiang , Ming-Hsuan Yang , Kevin Murphy , William T. Freeman , Michael Rubinstein

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2023-01-02

We present Muse, a text-to-image Transformer model that achieves state-of-the-art image generation performance while being significantly more efficient than diffusion or autoregressive models. Muse is trained on a masked modeling task in discrete token space: given the text embedding extracted from a pre-trained large language model (LLM), Muse is trained to predict randomly masked image tokens. Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient because it uses discrete tokens and requires fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient because it uses parallel decoding. The use of a pre-trained LLM enables fine-grained language understanding, which translates to high-fidelity image generation and the understanding of visual concepts such as objects, their spatial relationships, pose, cardinality, etc. Our 900M parameter model achieves a new SOTA on CC3M, with an FID score of 6.06. The Muse 3B parameter model achieves an FID of 7.88 on zero-shot COCO evaluation, along with a CLIP score of 0.32. Muse also directly enables a number of image editing applications without the need to fine-tune or invert the model: inpainting, outpainting, and mask-free editing. More results are available at https://muse-model.github.io

Spectral Bandwidth Recovery of Optical Coherence Tomography Images using Deep Learning

Timothy T. Yu , Da Ma , Jayden Cole , Myeong Jin Ju , Mirza F. Beg , Marinko V. Sarunic

Categories: Artificial Intelligence | Computer Vision

2023-01-02

Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases. Technological developments to increase the speed of acquisition often result in systems with a narrower spectral bandwidth, and hence a lower axial resolution. Traditionally, image-processing-based techniques have been used to reconstruct subsampled OCT data; more recently, deep-learning-based methods have been explored. In this study, we simulate reduced axial scan (A-scan) resolution by Gaussian windowing in the spectral domain and investigate the use of a learning-based approach for image feature reconstruction. In anticipation of the reduced resolution that accompanies wide-field OCT systems, we build upon super-resolution techniques to reconstruct lost features using a pixel-to-pixel approach with an altered super-resolution generative adversarial network (SRGAN) architecture, with the aim of better supporting clinicians' decision-making and improving patient outcomes.
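
To see why a narrower spectral bandwidth lowers axial resolution, the sketch below applies Gaussian windows of two widths to an idealized spectral fringe from a single reflector and compares the widths of the resulting depth peaks. The signal model, the window widths, and the function names are assumptions for illustration only and do not reproduce the study's simulation.

```python
import numpy as np

def axial_profile(sigma, n=1024, depth_bin=200):
    """Depth profile of one ideal reflector after Gaussian windowing of the spectrum."""
    k = np.arange(n)                                   # spectral sample index
    fringe = np.cos(2 * np.pi * depth_bin * k / n)     # interference fringe of one reflector
    window = np.exp(-0.5 * ((k - n / 2) / sigma) ** 2) # Gaussian spectral window
    return np.abs(np.fft.rfft(fringe * window))        # magnitude of the A-scan

def width_at_half_max(profile):
    """Number of depth bins at or above half of the peak value."""
    return int(np.sum(profile >= 0.5 * profile.max()))

wide = axial_profile(sigma=200.0)    # broad spectral window
narrow = axial_profile(sigma=50.0)   # reduced spectral bandwidth

# The narrower spectral window produces the broader peak in depth.
print(width_at_half_max(wide), width_at_half_max(narrow))
```

Pairs of full-bandwidth and windowed images produced along these lines are the kind of input and target a learning-based reconstruction could be trained on.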

BSA -- Bi-Stiffness Actuation for optimally exploiting intrinsic compliance and inertial coupling effects in elastic joint robots

Dennis Ossadnik , Mehmet C. Yildirim , Fan Wu , Abdalla Swikir , Hugo T. M. Kussaba , Saeed Abdolshah , Sami Haddadin

Category: Robotics

2022-12-30

Compliance in actuation has been exploited to generate highly dynamic maneuvers, such as throwing, that take advantage of the potential energy stored in joint springs. However, the energy storage and release could not yet be well timed. On the contrary, for multi-link systems, the natural system dynamics might even work against the actual goal. With the introduction of variable stiffness actuators, this problem has been partially addressed. With a suitable optimal control strategy, an approximate decoupling of the motor from the link can be achieved to maximize the energy transfer into the distal link prior to launch. However, such continuous stiffness variation is complex and typically leads to oscillatory swing-up motions instead of clear launch sequences. To circumvent this issue, we investigate decoupling for speed maximization with a dedicated novel actuator concept denoted Bi-Stiffness Actuation. With this concept, it is possible to fully decouple the link from the joint mechanism by a switch-and-hold clutch and simultaneously keep the elastic energy stored. We show that with this novel paradigm, it is not only possible to reach the same optimal performance as with power-equivalent variable stiffness actuation, but also to directly control the energy transfer timing. This is a major step forward compared to previous optimal control approaches, which rely on optimizing the full time-series control input.

DRG-Net: Interactive Joint Learning of Multi-lesion Segmentation and Classification for Diabetic Retinopathy Grading

Hasan Md Tusfiqur , Duy M. H. Nguyen , Mai T. N. Truong , Triet A. Nguyen , Binh T. Nguyen , Michael Barz , Hans-Juergen Profitlich , Ngoc T. T. Than , Ngan Le , Pengtao Xie

Category: Computer Vision

2022-12-30

Diabetic Retinopathy (DR) is a leading cause of vision loss in the world, and early DR detection is necessary to prevent vision loss and support appropriate treatment. In this work, we leverage interactive machine learning and introduce a joint learning framework, termed DRG-Net, to effectively learn both disease grading and multi-lesion segmentation. Our DRG-Net consists of two modules: (i) DRG-AI-System, which classifies DR grades, localizes lesion areas, and provides visual explanations; and (ii) DRG-Expert-Interaction, which receives feedback from expert users and improves the DRG-AI-System. To deal with sparse data, we use transfer learning mechanisms to extract invariant feature representations by using Wasserstein distance and adversarial learning-based entropy minimization. In addition, we propose a novel attention strategy at both low- and high-level features to automatically select the most significant lesion information and provide explainable properties. In terms of human interaction, we further develop DRG-Net as a tool that enables expert users to correct the system's predictions, which may then be used to update the system as a whole. Moreover, thanks to the attention mechanism and the loss-function constraints between lesion features and classification features, our approach can be robust to a certain level of noise in user feedback. We have benchmarked DRG-Net on the two largest DR datasets, i.e., IDRID and FGADR, and compared it to various state-of-the-art deep learning networks. In addition to outperforming other SOTA approaches, DRG-Net is effectively updated using user feedback, even in a weakly supervised manner.

Ontology-based Context Aware Recommender System Application for Tourism

Vitor T. Camacho , José Cruz

Category: Machine Learning

2022-12-29

In this work, a novel recommender system (RS) for tourism is presented. The RS is context-aware, as is now the norm in state-of-the-art recommender systems, and works on top of a tourism ontology that is used to group the different items being offered. The presented RS mixes different types of recommenders into an ensemble that changes with the RS's maturity: it starts from simple content-based recommendations and iteratively adds popularity, demographic, and collaborative filtering methods as rating density and user cardinality increase. The result is an RS that mutates during its lifetime and uses a tourism ontology and natural language processing (NLP) to correctly bin the items into specific item categories and meta-categories in the ontology. This item classification facilitates the association between user preferences and items and allows the items being offered to be better classified and grouped, which in turn is particularly useful for context-aware filtering.

Policy Optimization to Learn Adaptive Motion Primitives in Path Planning with Dynamic Obstacles

Brian Angulo , Aleksandr Panov , Konstantin Yakovlev

Category: Robotics

2022-12-29

This paper addresses kinodynamic motion planning for non-holonomic robots in environments with both static and dynamic obstacles, a challenging problem that still lacks a universal solution. One promising approach is to decompose the problem into smaller sub-problems and combine the local solutions into a global one. The crux of any planning method for non-holonomic robots is the generation of motion primitives that solve the local planning sub-problems. In this work, we introduce a novel learnable steering function (policy) that takes into account the kinodynamic constraints of the robot as well as both static and dynamic obstacles. This policy is efficiently trained via policy optimization. Empirically, we show that our steering function generalizes well to unseen problems. We then plug the trained policy into sampling-based and lattice-based planners and evaluate the resulting POLAMP algorithm (Policy Optimization that Learns Adaptive Motion Primitives) in a range of challenging setups that involve a car-like robot operating in obstacle-rich parking-lot environments. We show that POLAMP is able to plan collision-free kinodynamic trajectories with success rates higher than 92% when 50 simultaneously moving obstacles populate the environment, outperforming state-of-the-art competitors.
